Appendix 1: Full NotebookLM Output

Raw Output from NotebookLM

This is the full transcript excerpt that NotebookLM extracted from the AI Snake Oil video; the bracketed numbers, braces, and source labels that appear below are part of NotebookLM's own output formatting:

`AI Snake Oil: Hype and RealityHere is the corrected transcript:{ NEW YOUTUBE SOURCE Excerpts from the transcript of the video "AI Snake Oil: What Artificial Intelligence Can Do, What It Can't, and How to Tell the Difference" uploaded on the YouTube channel "MIT Shaping the Future of Work Initiative":[1] ASU OZDAGLAR: Maybe we should get started, right? Hi, everyone. It's a pleasure to welcome you all to tonight's talk with Professor Arvind Narayanan. The Schwarzman College of Computing is honored to co-host this event with MIT's Shaping the Future of Work Initiative. We're excited to have this unique convergence of minds and missions at the intersection of technology, society, and future of work. We're honored to be joined by Professor Arvind Narayanan from Princeton, also the co-author of the book "AI Snake Oil."[2] At such a critical time, when there's so much debate and discussion around the promise and peril of AI, with many people focusing on existential risk, Arvind and Sayash's book brings a breath of fresh air and provides a balanced perspective on how we can navigate the hype and reality of AI. I personally recommend this book to everyone. Arvind in the book draws a parallel, a very effective parallel with snake oil, whose sellers promise miracle cures with false pretenses, sometimes ineffective but harmless,[3] but in other cases harms extending to loss of health or life-- very similar to AI. AI snake oil is AI that does not and cannot work. And the goal of the book is to identify AI snake oil and to distinguish it from places where AI can work very effectively, especially in high-stakes settings such as hiring, health care, and justice. I'm thrilled to represent the Schwarzman College of Computing as the deputy dean of academics. And our dean, Dan Huttenlocher, is also here with us tonight. And it's truly a pleasure to be here[4] with dynamic leaders of the Shaping the Future of Work Initiative, Daron Acemoglu and David Autor. DARON ACEMOGLU: Simon's not here. PRESENTER: Simon's not here. And also an effective, dynamic leader, Simon Johnson, who couldn't join us today. Shaping the Future of Work brings an evidence-based lens to economic and policy impacts of automation. And the Schwarzman College is reimagining how we do research and teach computing with social implications at our core. What unites these efforts and why we're so excited to have Arvind here tonight[5] is a shared commitment to clarity, rigor, and technical expertise in how AI technology is developed and deployed. Tonight's presentation and conversation promises to enlighten us, make us think about these important issues. And with that, please join me in welcoming Professor Daron Acemoglu from the Department of Economics, institute professor and faculty co-director of Shaping the Future of Work Initiative. DARON ACEMOGLU: Thank you, Alison. I don't need that. I have the lapel mic. Thank you. [APPLAUSE] Thank you very much.[6] Thank you, Alison, and thank you for everybody for being here. This is a great event, and I'm delighted that people have recognized it as a great event and filled it here. I want to say just two more words about the initiative for Shaping the Future of Work, which is co-led by myself, David Autor, and Simon Johnson, who unfortunately couldn't be here. And part of the reason why I want to say that is because I want to emphasize how synergistic Arvind's agenda is to what we want to do. 
We've launched this initiative because we're[7] worried about the future of work, the future of inequality, the future of productivity in the age of digital technologies and AI. And part of the reason we are concerned is precisely about how AI and other technologies are going to be used. And perspective, as the word "shaping" suggests, is one in which we argue that the future of these technologies is not given. It's not preordained, but different technologies have different consequences. And we want to understand those consequences, and we want to steer technology via variety of channels,[8] mostly coming from the academic research we're doing, and our collaborators are doing, and our affiliates are doing towards the more socially beneficial directions. And I think I cannot imagine somebody better than Arvind to actually give much greater depth and breadth to this, because Arvind is a professor of economics at Princeton-- professor of computer science, although we could have you as a professor of economics as well, Arvind. I think it's fair game. Professor of computer science at Princeton and the director of the Center for Information Technology[9] and Policy is bringing, even without the book, a unique perspective, great technical expertise, but a very clear-eyed and deep understanding of many applications of AI. And that is exactly the space where we need to be-- not excessive optimism, not excessive pessimism, but understanding what are the things that AI can do productively, what are the things it cannot do at the moment, perhaps never, and what are the things that it can do but are not going to be great? So Arvind's book, "AI Snake Oil," which you're going to hear about,[10] is full of amazing insights ranging from predictive AI to generative AI, large language models to social media to machine learning and the mistakes you can make with machine learning. I think we're going to get a glimpse of many of these excellent points and, hopefully a lot of food for thought for everybody. Arvind's going to speak for 20, 25 minutes, and then we're going to have a little bit of a conversation for 15 minutes or so. And then we're going to open it up for Q&A. So please give a warm welcome to Arvind.[11] And we're really delighted to have him here. [APPLAUSE] ARVIND NARAYANAN: Hello, everybody. Thank you, Daron, and Asu for such kind words. It's really my pleasure to be here today. And I really mean it because the origin story of this book is actually right here at MIT. So let me tell you how that happened. This was way back in 2019, when I kept seeing hiring automation software. And the pitch of these AI companies to HR departments was, look, you're getting hundreds of applications, maybe 1,000 for each open position.[12] You can't possibly manually review all of them. So use our AI software and ask your candidates to record a video of themselves speaking for 30 seconds, not even about their job qualifications but about their hobbies or whatever. And this is from the promotional materials of an actual company. And the pitch was that our AI will analyze that video and look at the body language, speech patterns, things like that, in order to be able to figure out their personality and their suitability for your particular job.[13] And you can see here this software has characterized this person on multiple dimensions of personality. That's only one of five tabs. And on the top right, they have been characterized as a change agent. And their score is 8.982 digits of precision. 
That's how you know it's AI. That's how you know it's accurate. And it didn't seem to me that there is any known way by which this could possibly work. And sure enough, now, six years later, none of these companies have released a shred of evidence that this can actually predict someone's job performance.[14] And in the few instances that journalists have been able to use creative methods to try to see if these techniques work or not, here's the kind of thing that they have found. So here was an investigative journalist who uploaded a video, two copies of a video. And in one case, they digitally added a bookshelf in the background. And they also tried changing in another experiment glasses versus no glasses. Radically different scores. Right? So I didn't have this evidence back then, but this is what I suspected.[15] And coincidentally, at that time, I was invited to give a talk here. And I gave a talk called "How to Recognize AI Snake Oil." And I said, look, there are many kinds of AI, some things like generative AI, which wasn't called generative AI back then. Those are making rapid progress. They work well, but there are also claims being made like this. I called it an elaborate random number generator, and people seemed to like that talk. So I put the slides online the next day. I thought 20 of my colleagues would look at it.[16] But in fact, the slides went viral, which I didn't know was a thing that could happen with academic work. And I realized it wasn't because I had said something profound but because we suspect that a lot of the AI-related claims being made are not necessarily true. But these are being made by trillion-dollar companies and supposed geniuses. So we don't feel like we necessarily have the confidence to call it out. And so when I was able to say, look, I'm a computer science professor, I study AI, I build AI, and I can tell you that some of these claims[17] aren't backed by evidence, that seemed to resonate with a lot of people. And within a couple of days, I had like 30 or 40 invitations to turn that talk into an article or even a book. I really wanted to write that book. But I didn't feel ready because I knew that there was a lot of research to be done in presenting a more rigorous framework to understand when AI works and when it doesn't. And so that's when Sayash Kapoor joined me as a graduate student. So we did about five years of research. And the book is a summary and a synthesis[18] of that research, some of which we've also published in the form of a series of papers leading up to that. So let me just take the next 15 minutes or so to give you some of the main ideas from the book. The starting point of the book is to recognize that AI is not one single technology. It's an umbrella term for a set of technologies that are only loosely related to each other. This is ChatGPT. I don't need to tell you what it is. But on the other hand, technology that banks might use in order to classify someone's credit risk[19] is also called AI. There's a reason they're both called AI. They're both forms of learning from data. But in all the ways that matter, in how the technology works, what the application is, and, most importantly, how it might fail and what the consequences are, these two things couldn't be more different from each other. So the thing on the right is an example of what we call predictive AI. And what all predictive AI applications have in common is a certain logic for decision-making. 
It's ways of making decisions about people, often[20] very consequential decisions, based on a prediction of what they will do in the future or what will happen to them in the future. And that decision is made using machine learning based on data from past similar people. So this is used in hiring, and there the logic is who will do well at a job. It's used in lending. There the logic is who might pay back a loan or not. It's used in criminal justice, and there the logic is who might commit a crime or not commit a crime. It's used in health care. It's used in education.[21] Ever-expanding set of domains. And predictive AI is something we're very dubious about. And I'll come back to that in a second. And then, of course, there's generative AI. In addition to generating text, there's an ever-expanding variety of things that it can do. We also talk a lot in the book about social media algorithms and what are some of the societal-scale risks that can arise out of that, as opposed to discrete risks to particular individuals. And we talk a little bit about self-driving cars and robotics,[22] which we'll come back to in a few minutes. So why are we so skeptical about predictive AI? If we look at the criminal justice example, for instance, in the majority of jurisdictions today, when someone is arrested, when the judge faces a decision, there's months or maybe years until their trial. Should that person spend that time in jail, or should they be free to go, or should there be an ankle monitor or any number of options for release? That decision is made or guided by an algorithmic system and automated decision-making system, or at least[23] decision recommendation system. It's a statistical learning system. You could call it AI. It's something that falls under the umbrella of what we call predictive AI. And the problems with this have been known for a long time. In 2016, there was this well-known investigation by ProPublica called "Machine Bias," where they did a Freedom of Information Act request. These companies are notoriously secretive. They managed to get a lot of data. And they showed that the false positive rate for the particular algorithm that they studied[24] was twice as high for Black defendants as it was for white defendants. And so we've known about these problems with racial bias. But when I looked at that study, there was one thread in it that I felt didn't get picked up nearly enough, which is that the predictive accuracy of these methods is not really that high. So if you know about how accuracy is measured in machine learning, AUC is often used, Area Under the Curve. And the best numbers that you can get here are less than 70%. And 50% is random guessing, right?[25] So we're making decisions about someone's freedom based on something that's only slightly more accurate than the flip of a coin. And the vast majority of people who are predicted to be at high risk do not, in fact, go on to commit another crime. So we felt that how is it ethical to use this for anybody, whether it's a Black defendant or a white defendant or anyone else? So that is one of the main points we make in the book and also in papers leading up to that, that it's hard to predict the future.[26] We just don't know who's going to commit a crime in the future. And so we shouldn't so easily accept this idea of pre-crime of determining someone's fate based on a prediction of a crime they will commit in the future, as opposed to a determination of guilt. OK. So let's talk about generative AI. 
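A note on the AUC figures cited above: AUC (area under the ROC curve) can be read as the probability that a randomly chosen positive case receives a higher risk score than a randomly chosen negative case, so 0.5 is coin-flip level and anything under 0.7 is only weakly informative. The short sketch below illustrates this on synthetic data; it is not from the talk or the book, and the labels and scoring rules are invented.

```python
# Illustration of AUC on synthetic data (labels and scores are invented).
# AUC ~= probability that a randomly chosen positive case is scored higher
# than a randomly chosen negative case; 0.5 is coin-flip level.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 10_000
y = rng.integers(0, 2, size=n)          # hypothetical yes/no outcome labels

random_score = rng.random(n)            # uninformative score
weak_score = 0.2 * y + rng.random(n)    # weakly informative score

print(roc_auc_score(y, random_score))   # ~0.50: no better than chance
print(roc_auc_score(y, weak_score))     # ~0.68: the "below 70%" range cited above
```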
There's a lot more to say about predictive AI. But maybe we can save that for the conversation and for the Q&A. Generative AI, of course-- in addition to text, it can generate any one of a number of things. Look, there are limitations.[27] There's a lot of hype, and I'll talk about some of the downsides in a second. But we're also very clear in the book that generative AI is useful to basically every knowledge worker, anyone who thinks for a living. And I'm sure we'll talk about the labor implications. But I also wanted to emphasize for a second that a big aspect of it is that it's a technology a lot of the time that's just very fun to use. And I just wanted to keep that in the conversation because that is often easily forgotten when we're talking about these serious aspects of AI.[28] And in my own personal use of AI, I certainly use it for my research but quite a bit in my personal life as well. I have two young kids, and I often find myself using an AI in ways that really enrich our relationship when I'm spending time with my kids. So the other day, for instance, I was teaching my daughter fractions. And it's hard for a kid to understand the idea of fractions. So I pulled out an AI app on my phone. And these days you can produce a little app that's created on the spot by AI based on a text description[29] of what you want the app to do. So I asked the AI agent to create an app to visualize fractions, and it made this little game. It's a little slider, and it generates a random fraction and asks the child to try to guess where it goes on the line. And once they guess, it will check that guess, and it will divide the line into many parts, visualize what one third looks like, give a score and keep score, and keep generating new fractions. So we played with this for 15 minutes, and it really helped her. And I've done this with all kinds of things,[30] generating random clock faces as a way to teach her to tell time, et cetera. And what's really cool about this is that you make this app once. You play with it. My child doesn't have a huge attention span. It's useful for 15 minutes, and then it's done. You throw it away, right? And that's amazing. You couldn't have imagined doing this a couple of years ago because it would have taken, at a minimum, several hours to create an app. OK. So with that said, we are also critical in the book about the generative AI industry's irresponsible release[31] practices. And there are, of course, many harmful consequences of this. And as we say in the book, it's like everyone in the world has been simultaneously given the equivalent of a free buzz saw. There are AI-generated books on Amazon by people just trying to make a dollar. In some cases, it's just an annoyance. Maybe you lost $0.99, which is often what these things are sold for because you unknowingly bought an AI-generated book. But in some cases, there are things like foraging guides for mushrooms generated[32] by AI, full of hallucinations. And so those can have life-or-death consequences. And there have been, in many cases, life-or-death consequences, people developing companionships with their AI bots and chat bot encouraging their suicidal tendencies and that sort of thing. And the biggest one in our mind is these AI nude apps, which you've probably heard about. It's been an epidemic in so many countries around the world, especially in high schools. And these are apps that can take a picture of a person[33] and create a nude image of that person based on the photo that you upload. 
And this has affected hundreds of thousands of, obviously, primarily women around the world. And not only AI companies but also policymakers being so slow to recognize this problem and doing something about it has been a real shame. Since 2019, since long before the latest wave of generative AI advancements, we've had evidence that this is a problem that's happening on a massive scale.[34] is the labor that goes into making these large-scale generative AI models. Yes, they're trained on data from the internet, but they're also post-trained, as it's called, based on human interaction. And there is a lot of human annotation work that is necessary to essentially clean the training data, if you will, that goes into making these models. And this work is offshored to developing countries. It's trauma-inducing work because day in and day out, you have to look at videos of beheadings or racist diatribes[35] or whatever and make sure that doesn't get into the input or output of AI. And the working conditions are so precarious that a lot of these AI companies have often turned to people who don't have a lot of options in the labor market, like refugees or people in countries experiencing hyperinflation or prison labor and so forth. So something is clearly wrong here, and we need new labor rights for this. So having been critical of companies, I do want to say that it's not about putting all of the blame on the companies.[36] There's a lot of personal agency that all of us need to exercise. And we need to use judgment in knowing when the use of AI is even appropriate, which is separate from the issue of whether it works or not. A good example of this comes from the recent election. This example is not in our book because it's pretty recent. But there was a candidate in Cheyenne, Wyoming, who wanted the mayor to be an AI chatbot. And he said that if he were elected, all the decisions would be made using this chatbot. As far as I can tell, it was only ChatGPT behind the scenes.[37] But he called it VIC for Virtual Integrated Citizen, which certainly sounds more sophisticated. Yeah. I learned about this because "The Washington Post" called me to ask, what are the risks of having an AI mayor? [LAUGHTER] I was very confused by that question. And I kind of blurted out, what do you mean risks of having an AI mayor? It's like asking, what are the risks of replacing a car with a cardboard cutout of a car? I mean, sure, it looks like a car, but the risk is that you don't have a car anymore. I regretted it as soon as I said it.[38] It was a little bit snarky, but "The Post" printed it anyway. So let me explain what I mean. I mean, so his point was that politics is very messy and inefficient, a lot of fighting, et cetera. Let's make it more efficient with chatbots, but that completely misses the point. The reason politics is messy is that that's the forum we've chosen for resolving our deepest disputes. And to try to automate that is to miss the very point. So this is more or less one of the last things I want to say. I'm going to wrap up in a few minutes.[39] But this is kind of the framework we use in the book for thinking about how we should look at any particular AI application. It's a two-dimensional figure. On one dimension, you have how well does it work. Does it work as claimed, or is it overhyped, or does it not work at all? And is it a kind of snake oil? 
But on the other dimension, we have the fact that AI can be harmful because it doesn't work as claimed and it's snake oil, or it can actually be harmful because it works well and it works exactly as claimed.[40] So let me give you examples of each of those kinds of things. So let's start with the top right here. I mentioned those video interviews. I mentioned criminal risk prediction. Cheating detection is, of course, when professors suspect that students are using AI. They might turn to these cheating detection tools, but they just don't work, at least as of today. And they're more likely to flag non-native English speakers. And I've heard so many horror stories of students being falsely accused. As things stand today, that very much feels like snake oil to me.[41] But on the bottom right, though, are things like mass surveillance using facial recognition. Historically, facial recognition hasn't worked that well, but now it works really, really well. And that, in fact, is part of the reason that it's harmful if it's used without the right guardrails and civil liberties and so forth. Then we talk about content moderation, which we explain in what way it's overhyped. But basically our interest in the book is everything except the bottom left. Those are applications-- simple things, for instance,[42] like autocomplete that kind of fade into the background and really work well. And our goal is to have an intervention so we can equip people to push back against AI that is problematic. You wouldn't want to read a book that is 300 pages on the virtues of autocomplete. And I say that because I think that bottom left corner is very important. There's more in that corner than we might suspect. And to explain that, let me give you a funny definition of what AI is. And this definition says AI is whatever hasn't been done yet.[43] AI is whatever hasn't been done yet. So what does that mean? What it means is that when a technology is new, when its effects are double-edged, when it doesn't work that well, that's when we're more likely to call it AI. When it starts working well, it's reliable. It kind of fades into the background. We take it for granted. We don't call it AI anymore. And this has happened over and over with many kinds of automation, Roomba and other robotic vacuum cleaners. I mentioned autocomplete, handwriting recognition,[44] speech recognition, which I'm sure many of us use on a daily basis to transcribe. And even spell check was at one point a cutting-edge example of AI. So this is the kind of AI we want more of. We want technology that's reliable, that does one thing, does it well, and kind of fades into the background. So that's something that we hope that our critical approach can nudge the industry towards. And our optimistic prediction about AI is that one day much of what we call AI today will fade into the background but certainly not all of it.[45] So, for instance, not criminal risk scoring. There are intrinsic normative questions that won't go away with making the technology more reliable. But self-driving cars-- although they are in the news today, often for the wrong reasons, because of accidents and so forth, it is going to be the case that those are solvable engineering problems. There has already been dramatic progress in solving them. And one day these things are going to be widely used. They're going to become part of our physical infrastructure.[46] We'll take them for granted. And the word "car" one day will just mean self-driving car. Right? 
And we're going to need a new name for what we call cars today, like maybe manual car or something. And there are some downsides there. We have to think about the labor implications of gig workers and so forth. But ultimately it will have been a good thing because there are one million deaths from auto accidents every year. So, again, that's our vision for a positive kind of AI.[47] I think we need to change in terms of shaping AI for the better, there are many different recommendations in the book. But I would cluster them into three big areas. One is we need to know which applications are just inherently harmful or overhyped and we probably should not even deploy. And secondly, even when it does make sense to deploy an AI application, there are often so many risks. And we need guardrails for those. And the third one is more structural. It's really about the fact that AI is exacerbating some of the inherent capitalistic[48] inequalities that we see in our society. So how do we limit companies' power and redistribute AI benefits? Let me take one last minute to tell you about a paper that we released just a couple of days ago, which is a follow-up to "AI Snake Oil." "AI Snake Oil" looks at what's going wrong with AI today and how do we fix it. Our new paper, it's called "AI as Normal Technology." And it's kind of a vision for AI over the next maybe 20 years. It's taking a longer-term look. And it's trying to give a framework for thinking about AI that's an alternative[49] to the major narratives that we have today. There are three major narratives about AI today. The first one is that it's superintelligence that will usher in a utopia. The second one is closely related. It's a superintelligence, but it will doom us rather than benefit us. And the third one is that we should be very skeptical about AI. It's just a fad. It's so overhyped. It's going to pass very soon. And our approach in "AI Snake Oil" is a middle ground. It doesn't fit into one of these narratives. But these three narratives are so compelling that we're often[50] thought of as saying that AI is a fad that's soon going to pass. That's not what we're saying, but especially in the new paper, we're making that very concrete. We're giving a fourth alternative way to think about AI. And this is closely modeled on what we know from past technological revolutions like the Industrial Revolution, like electricity, like the internet. We do think AI is going to have transformative effects. But we think they're going to unfold over a period of many decades, as opposed to suddenly it's[51] going to have both good and bad effects. We think a lot of the superintelligence and catastrophic risks have been greatly exaggerated. We think that we're already in a good place to know how to address some of those risks if they do come up. And on the basis of all of this, we have some policy ideas for steering and shaping AI in a more positive direction. So I'll stop here, and I really look forward to the conversation. Thank you so much. [APPLAUSE] DARON ACEMOGLU: I'll do it this way. ARVIND NARAYANAN: These are fancy chairs.[52] DARON ACEMOGLU: Yeah, as long as I don't fall off them. [CHUCKLING] All right. That's fantastic, Arvind. Thank you for giving a very, very succinct but very effective summary of the book. So I want to start from the predictive AI part. So I think that was one of the items in the book that I thought was super interesting and super revealing. 
But I want to dig a little bit deeper and understand where the more foundational concerns about predictive AI are coming from. And, I guess, as an economist, perhaps one place[53] one could start is by distinguishing one-person decision problems or one-person interaction problems versus social problems in which there is an interaction. So if you, for example, build an AI tool for a runner to decide when, say, for example, she's going to need more liquids or when she's had enough with the running or something like that. That's sort of a predictive tool, but it doesn't have this social interaction aspect. It still has the human agency. So you might say, well, human agency means we can never predict anything.[54] But I think I'm not sure whether you want to go there, or is it more of in these game theoretic situations where what I do will depend on what others will do? There are these complex interactions that we don't understand. And, I guess, if it's the latter, is there a way to, for example, cut some of the more complex things into smaller pieces and make some progress? Like, for example, I don't see Sendhil here. But Sendhil's very interesting work on bail--[55] that's a social problem. But could we reduce it with the right guardrails to a decision problem for a judge? My guess is it's not going to be easy, but perhaps. ARVIND NARAYANAN: Yeah. Thank you. That's a fantastic question. And, yes, Sendhil and others have a great paper, "Prediction Policy Problems," that lays out their vision for how to use what we call predictive AI for various things such as bail and other things. And I was mentioning that our work is based on some papers we've written. The main one on this particular question is called "Against Predictive Optimization," which is sort of intended[56] as a counterpoint specifically to the prediction policy problems paper. And to your point exactly that if we were using AI to predict for a runner when they might need fluids or whatever, certainly doesn't raise these concerns at all. Yes, it is something about the social nature of it. Specifically, it's about the fact that an entity with power is exercising that power over an individual. And there I think we need to go beyond concerns of accuracy, and economic efficiency, and so forth and ask from a philosophical perspective,[57] when is this exercise of power justified? So in our paper, we actually did engage with philosophers a little bit. And we read that literature and tried to connect it with these more concrete AI questions. So while we do say that the accuracy is very poor, how can we do this when it's only slightly more accurate than the flip of a coin? Even if the accuracy was much better at a fundamental normative level, we do think, again-- don't want to go into too much detail for reasons we talk about in the paper-- when the relationship is[58] that of exercising power over someone, there are more considerations that come into play. DARON ACEMOGLU: I think that's a very important point that many of these things-- I think the statement that technology is never neutral-- that's like now part of the folklore. But it's not that it's neutral. New technologies really change the power balance, especially with large corporations, which is, I guess, a good segue to my second question, which is generative AI. 
So I thought the generative AI discussion was very interesting as well.[59] So one take, which I think is sort of close to my view, is there are a lot of very exciting capabilities of generative AI, but there aren't that many applications. ARVIND NARAYANAN: Mhm. DARON ACEMOGLU: And I think I don't see pretty much any applications except in a very few areas like programming subroutines, et cetera, which is really going to change the production structure yet. I think that's not exactly the way you put it, but I think it's similar too. So is there a fundamental reason for that, or is this just a passing phase?[60] ARVIND NARAYANAN: Yeah. So I completely agree with you that that's the state of things right now. Where we might disagree maybe a little bit is that I'm not so sure it's fundamental. I think it can change in the future, and it's already changing now. And let me explain what I mean. And this is a big part of what we get into in the "AI as Normal Technology" paper. When ChatGPT came out, the fact that it was so general-purpose that you could make it do different tasks simply by changing the prompt really misled,[61] I think, not just a lot of users but also the companies themselves, from talking to many developers in the AI industry, into thinking that this was a new paradigm of software development, that this had obviated the need for building software to do specific things for you, software in the legal sector, or software for helping you with your writing, or whatever it is. Then you can use these general-purpose models. And going forward, all that was going to be needed is prompting. And that approach was tried for a year or two[62] and has miserably failed. And we analyze in our newsletter-- for instance, we have an "AI Snake Oil" newsletter, where we use the foundational approach in the book to analyze ongoing developments why many products that were simple wrappers around large language models and tried to actually get them to do useful things in the real world instead of simply spitting out text. Those have been pretty bad failures. So there was a device called the Rabbit. Do folks remember this? And then there was Marques Brownlee's review[63] and saying it was the worst thing he'd ever reviewed. And there was a little bit of a scandal about that and so forth. So that is exactly an example of what you pointed out, which is that the capability is there. These large-language-model-based agents are capable of doing very interesting things like navigating a website, doing shopping for you. But the thing is, because they haven't developed products around them and gotten the reliability rate up from, let's say, like 80%, where it is now, all the way[64] to 99.99%, which is something we expect any software product to have. These are pretty much dead on arrival. People are complaining that it ordered their products to the wrong address. Right? Would you use a product a second time if it did that? So that's an example of where companies have dropped the ball in translating those capabilities into products. They are changing their approach now. I think there's a very good chance that we're going to see a mushrooming of products in the few years coming up. DARON ACEMOGLU: I definitely didn't[65] mean to suggest that it was fundamental. But I think I'm also not convinced it's going to go away very quickly. I guess one reason which is different from the ones that you've articulated-- so let me try that on you-- is every job is a very complex bundle of tasks. ARVIND NARAYANAN: Mhm. 
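To make the 80% versus 99.99% reliability gap discussed above concrete, here is a rough back-of-the-envelope sketch (it assumes independent steps and is not a calculation from the talk): if an agent completes each step of a task correctly with probability p, a k-step task succeeds with probability p^k.

```python
# Back-of-the-envelope sketch (assumes independent steps; not from the talk):
# per-step reliability p over k steps gives roughly p**k end-to-end reliability.
for p in (0.80, 0.99, 0.9999):
    for k in (5, 10, 20):
        print(f"per-step {p:.4f}, {k:2d} steps -> end-to-end {p**k:6.1%}")
```

At 80% per step, a ten-step errand succeeds only about one time in ten, which is consistent with complaints like the order going to the wrong address.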
DARON ACEMOGLU: And the way that we've done automation in the past is that we've done careful or semi-careful division of labor so that certain tasks can be separated. Other complementary tasks can be performed by humans, or organizations can be changed.[66] It requires a lot of specific knowledge for the occupation, for the industry, a lot of tacit knowledge. And I think the approach of the leading AI companies has been, well, we're going to go to AGI or very close to AGI. Everything can be done. So we don't need any of this tacit knowledge. So we're just going to throw these foundation models, and they're going to do it. And I think that's never going to work because even with very fast advances in foundation models, a foundation model is not going to be able to do everything[67] that even an auditor does. And when you go to an educator or to a health professional, I think it's very unlikely. So you really need this tacit, very specific domain knowledge. So I don't see that path being followed yet. ARVIND NARAYANAN: 100% agree. I think this is another area where AI developers really fooled themselves. I think there was misleading intuition from the last few years of rapid AI progress, whereby scaling up these models and by training them on bigger and bigger chunks of the internet, there were more and more emergent capabilities.[68] That approach has run out, and not only because they're already training on all of the data they can get their hands on but also because the new things that are left for these models to learn are exactly things like tacit knowledge. There is a way to learn tacit knowledge, but it is not in the passive way that models are being trained right now. It is by actually deploying models or AI systems, even relatively unreliable AI systems in small settings in different domains on a sector-by-sector basis, not in a general-purpose way, and learning[69] from those interactions with real users and real domain experts. This is the kind of positive feedback loop that we've had, for instance, with self-driving cars. The reason that it took about two decades from the first demonstrations of successful self-driving to getting them to a place where they're able to autonomously ferry people on the roads right now is that you have to slowly scale up. You drive 1,000 miles, and then you collect data. That allows you to improve your reliability. Now it's reliable enough to drive 10,000 miles,[70] and then you go to 100,000, et cetera. That's a very slow process. We predict that we're going to see that kind of slow feedback loop going forward on a sector-by-sector basis. DARON ACEMOGLU: OK. Well, I think that's a segue for the next question, which I wasn't sure whether we were going to ask, because you wisely stay away from AGI a lot in the book. I guess, here is the argument that many people have in their minds, which makes something like AGI a default position. At the end, the human mind is a computer.[71] Whatever substrates it uses, it's the computing machine of sorts. Well, we're going to build better and better computing machines. So therefore we'll go to AGI. So I think then any-- I think this is a sort of a bait and switch. Then it rather puts anybody that says, well, show me the money, in a defensive position. But if we were in that defensive position, either we would have to disagree with that scenario, or we would have to say, well, here are the bottlenecks that you haven't taken into account.
And I would be curious to know whether you would completely[72] avoid being put in that position or you have something to say about the presumption or the bottlenecks. ARVIND NARAYANAN: Yeah. No, I'm more than happy to talk about it. These are certainly some of our more controversial views, at least within the tech community. DARON ACEMOGLU: Not here. ARVIND NARAYANAN: Yeah. So let me say two things to this. One, this has been consistently predicted throughout the history of AI for more than 70 years. When they first made what are called universal computers--[73] we just called them computers now-- but back then they were called universal computers because it was a very new concept that you could build one piece of hardware to do any task by programming it appropriately, as opposed to building special-purpose hardware for each particular task. The excitement around that was exactly similar to the excitement around general-purpose generative AI models today. They thought we've done the hard part. The hardware, it's right there in the name. And now we just need to build the software to emulate the human mind. And they thought that was only a couple of years away.[74] The pioneering 1956 Dartmouth Conference proposed a, quote unquote, two-month, 10-man effort to make very substantial progress towards AGI, which was just called AI back then. So over and over, while it might be possible in principle that we can have software do all the things the human mind does, AI developers have been so off in knowing how much the gap is between where things currently are and where we need to be. So that's one thing. The second thing-- I know we're running out of time. So let me very quickly say we talk about this[75] in the book in chapter five. In our view, human intelligence is not primarily a consequence of our biology but rather our technology, the fact that we've been using our technology for decades, for centuries, to learn more about the world. The most prized knowledge that we have that allows us to do things that we most associate with intelligence, whether it's medical testing or economic policy, these are things that we learned by doing large-scale experiments on people.[76] So we predict that very soon these computational limits are not going to matter. But the thing that is going to hold AGI back is not being able to easily transcend learning this knowledge from what humans have actually learned already and creating new knowledge for itself because that's going to require the same kind of bottlenecks that we ourselves have faced with experiments, scaling, ethics, and so forth. And we're not going to let AI do experiments on millions of us without any oversight. And so that is going to put very, very strong speed limits. DARON ACEMOGLU: Great.[77] That's an excellent point. I think we're running out of time. So I want to bring it, I think, to a topic that's actually much closer to our initiative, which is our concerns about whether we're doing the right kind of AI. So I think David, Simon, and I, all three of us, have these ideas-- some based on intuition, some based on empirical facts, some based on history-- that there is a more productive way in which you can develop AI, especially what we call pro-worker AI, which aims at increasing workers' skills,[78] expertise, productivity, create new tasks or capabilities to do more sophisticated tasks. And then we're worried whether we'll actually ever get there on the current path. 
And I guess you have very nicely cataloged various mistakes that people are making in terms of banking, or at least pretending that they're banking on AI that's unlikely to work. Or if it works, it's not going to be that great. AI hype-- perhaps that's leading to AI overinvestment. Perhaps it's leading to the wrong kind of AI investment.[79] But I guess at least I didn't see it, or perhaps I missed it, that next step in the book, which is therefore the wrong types of innovation, effort, R&D, et cetera, is being made. The wrong kind of startup energy is coming, and whether we can do anything about that other than, of course, inform the public, inform the policymakers with books and conversations like this. But is there a more sort of an agenda of that sort that would make you even more of a fellow traveler with us? ARVIND NARAYANAN: I think there is a little bit.[80] And I think that's exactly the critical question, and I'm so glad that you all are looking into that. And I think it needs perhaps economists more than computer scientists. But what I can say from a computer science perspective is that when we look at what companies are doing, just right now the market is not rewarding that. So an example of where this plays out is the use of AI by students. I know we're talking about labor, but I think it's somewhat similar. So the initial uses of generative AI were primarily for cheating or other things that are maybe not[81] the best ways to use them. We know there are good ways to use them. We know that AI, if properly configured, can be a very good tutor despite its limitations, such as hallucinations. I use AI very heavily for learning. I haven't stopped using books, but there are some advantages of using AI. Happy to get into that in the Q&A. But it's remarkable that only, I think, a couple of weeks or maybe a month ago that Anthropic came up with an AI tutor, which is just a simple customization of their model to be in a tutoring mode[82] where it's not just giving out answers to the student but rather promoting their critical thinking. And it's striking to me that took them so little work but it took them 2 and 1/2 years or whatever of people just constantly complaining in order to do. And so, yes, we can provide lots of technical ideas. But ultimately we need to either change the incentives for companies through regulation or have much more investment in other organizations, maybe NGOs who are going to develop these AI applications with the public interest in mind[83] instead of leaving it to the AI companies. DARON ACEMOGLU: Thank you very much, Arvind. I think that's a great time for us to transition, because I'm sure many people have burning questions for you. The way we're going to organize this is there are two mics over there. Those of you who want to ask questions, if you don't mind lining up, and then we can take one from each side in alternating order. Why don't we start on the right? AUDIENCE: Thank you very much. A wonderful talk. Just as a quick layman's questions as a user for AI[84] for over a year, my question is that how much you can tolerate the error that generated from AI giving you the answer. For instance, recently I tried to ask AI a question of a citation or quotation from a famous person. And then I get the answer. And then I post it on my blog, and I got embarrassed because I asked the professor. He said that I never say such statement. So the AI created or imagined that person x say something like this. 
I would say that AI give a lot of convenience, pattern recognition, and very convenient.[85] But I would say 10% or more they give you some error mixed with your correct answer. ARVIND NARAYANAN: Thank you. Yes, hallucinations are a big problem with generative AI. These are fundamentally stochastic technologies. Even if we could somehow clean all the training data and make sure that you only train it on true statements, the problem would remain because at generation time it is kind of remixing the statistical patterns in its training data. The hallucination rates have gone down quite a bit over the last couple of years, but they're not zero.[86] I don't think they're going to reach zero in a very short period of time. And I have also had the experience of people emailing me to ask for my papers. And they're like, hey, where is it? I couldn't find it online. Turns out it was made up by AI and attributed to me. [LAUGHTER] So what we tell people to do is to not just think about using AI generally in your work but identify specific areas of your workflow. And in each of those uses of AI, you have to have an answer to why is it easier to verify the answer to this question than to have done this work myself[87] in the first place. And if you don't have an answer to that, don't use AI. And if you do have an answer, it might save you time or enhance your creativity, as the case may be. AUDIENCE: Thank you. AUDIENCE: Thank you. Professor, I wanted to bring back again this topic of the fairness, or maybe bias of the algorithms. About six months or a year ago, Sam Altman came. And when someone asked him about the bias of the algorithms-- you can say that, for example, in the criminal justice system-- he said that it's easier to change the algorithm[88] than the bias from the humans. And I said, like, OK, maybe that's compelling. To be honest, for me, it was like, OK, understandable. What do you say about that? What do you think? Is it possible? Do you think it's better? Who sets what the bias is? Yeah. ARVIND NARAYANAN: Yeah. Thank you. It's a good debate. Sendhil has also made that claim. He had an op-ed in "The New York Times" literally saying it's easier to fix biased algorithms than biased humans. And I very much see that point of view.[89] I have a slightly different view. I think in theory it's certainly possible to fix certain kinds of biases in algorithms. It might just be a matter of changing a parameter in the code. And there are many computer science techniques to do that. The problem is not technical. The problem is one of incentives, and transparency, and things like that. A lot of the time, you don't have transparency from the companies who are building these criminal justice algorithms. They might say it's proprietary, it's a trade secret. So in the case of Compass, even though it's been about a decade[90] since that investigation, no changes have been made because actually you cannot fix it without introducing disparate treatment. If you were to fix it in the algorithm, you have to have different weights or different treatments, different thresholds for different categories of people. And that actually would violate the law. And these are things that human judges account for in a very subtle way when they're making their decisions. But when we're trying to do them in algorithmic systems, we have to do them in very crude ways, which,[91] even if theoretically possible, end up not being practically possible because of various constraints. AUDIENCE: Thank you. AUDIENCE: Hi. 
Thank you for the book and the conversation. So the question I wanted to ask-- kind of in the book and in the talk, there's this kind of general statement that it's like many predictive AI applications are unlikely to work and there's hope for gen AI. And I want to ask basically, is that predominantly a statement about, from your perspective, the underlying technologies or the settings in which those technologies are employed?[92] The reason I asked that is because there are certain-- I work on AI for climate-change-related applications, right? There are certain settings like solar power prediction, where you could use a predictive model or a generative model. And also there are some trends from some of the large technology providers now, where there's this statement of, like, we should invest in generative AI and large models because it's going to solve climate problems. And that leads to an investment in gen AI rather than other techniques, and all of the energy consumption,[93] and concentration of power that comes with that. And the statement of, like, gen AI is better than predictive AI can lead to those kinds of issues. Yeah. Circling back to the statement about the technologies or the settings. ARVIND NARAYANAN: Yeah. Thank you. That's a great question, and it's exactly the latter. It's a statement about the settings. It's about the particular applications that we're using these technologies for. Even if you were to take a generative AI model and use it in criminal justice, exactly the same list[94] of our objections would apply. And they don't apply in the solar power prediction setting, because you're not making consequential decisions about people that have massive ethical consequences. So, yeah, I'm totally with where you're coming from with that question. AUDIENCE: Thanks. DARON ACEMOGLU: Please. AUDIENCE: You said something about the predictions from people in the field about how soon AI would reach certain levels has a terrible track record. Let me suggest that's a sample bias, because all the bad predictions get all the press.[95] You never hear about the fact that somebody once asked John McCarthy what it would take to get really good artificial intelligence. And he was annoyed by the question, so he gave a somewhat whimsical answer. But he said "1.3 Einsteins," and he went on from there. That's not widely quoted, because it's not nearly the kind of thing you can make a big laugh out of. So let me caution you on that. ARVIND NARAYANAN: Thank you. I appreciate that. I should clarify I had a somewhat superficial presentation at that point when I said it just now.[96] But there's a deeper point behind it, which is not so much that the founders were wrong, but they were wrong for fundamental structural reasons in that they were-- I'm not blaming them-- they were not able to see what are the steps in the ladder, if you will. We use the metaphor of a ladder in our book to discuss progress in AI. When you're standing on one step of the ladder, we claim it's impossible to know what the future steps on the ladder are. So it's really about that deeper point, but thank you. I appreciate the correction.[97] AUDIENCE: And one other quick point. You got what I'll call an inexpensive laugh out of the idea that the predictive programs and their AUC was only about 70. So they weren't much better than flipping a coin. You have to ask about the baseline. How good were the people doing this task? Because if the people doing this task are at 55%, then 70% is pretty damn good. 
ARVIND NARAYANAN: Yeah. [APPLAUSE] OK. [LAUGHTER] All right. So the people were exactly at the same level as the algorithm and not even trained judges--[98] just laypeople. And not only that. Whatever you can accomplish with state-of-the-art algorithms, you can get with a two-variable regression model. And those two variables are the defendant's age and the number of prior arrests. And we talk about why the use of the age is actually morally problematic. So, essentially, the logic behind these systems is if you've been arrested a lot in the past, you're going to be arrested a lot in the future. That is the entire thing. And we actually say that we would actually[99] be much happier with a system where that was the hard-coded logic, because it would be apparent to everybody, especially the defendant, what is actually going on. So I'm with you that we have to look at the right baseline. But here we have a 40-page paper looking at the right baseline. And it doesn't look good for the algorithm. Maybe I'll go to another question. DARON ACEMOGLU: We have only less than three minutes left. So please very short questions at this point. AUDIENCE: I had a question about how you calibrate investment[100] across different types of technologies. So maybe you have one type of technology that with some very small probability will yield a big return and something that is likely to work but will not change the world. So how do you think about allocating investment between these different technologies? ARVIND NARAYANAN: Yeah, definitely. So this relates to an issue that we call herding in the AI community. And in every research community, there are fashionable ideas. People cluster around them. It's hard to compare those between fields.[101] But just kind of based on vibes, it seems like there is more of this going on in the AI community than in most other research communities. So today, all of these fancy generative AI systems are based on neural networks, which were sidelined for more than 20 years because people thought that they were completely outperformed by another technology called support vector machines, which people would laugh at in today's context. Right? So why did that happen? There wasn't enough diverse exploration of different paths.[102] So I do think it might be hard to compute the return on investment. But it seems clear that there needs to be more of a risk preference, if you will, and diversifying the set of research ideas we invest into. DARON ACEMOGLU: Please. AUDIENCE: Yeah. Since the 1980s, I think we've seen an ever-increasing gap between productivity and wages. I think there's a big argument for the stagnation of wages basically just occurring probably due to a number of factors but including computation kind of replacing and abstracting[103] a lot of the work. How do you view maybe the impacts of something like AI maybe impacting the productivity of a worker and how that might also affect the wages? And how can we correct that? ARVIND NARAYANAN: That's a question for you, I think. [LAUGHTER] DARON ACEMOGLU: But they don't want to hear from me. We want to hear from you. ARVIND NARAYANAN: No, no, no, no, no. Please go ahead. I want to hear from you. DARON ACEMOGLU: [LAUGHS] Well, this is what we were-- we spent our good chunk of our waking hours not just automating work, which will, of course, happen[104] and should happen, but also finding AI uses that will increase the information and the capabilities of workers to deal with more complex things. 
But how to get there is the real challenge. One last question. AUDIENCE: Oh, yeah. Thank you so much. I just wanted to ask. I think I've seen lately in the public there is a lot of fear and backlash against AI. And I wanted to know your thoughts on what might be contributing to that and also how people in either tech research or tech industries-- how they can address those[105] fears. ARVIND NARAYANAN: Definitely. Lots of thoughts on this, but I know we're running out of time, so let me keep it short. And, yes, you're absolutely right. Many more people, according to public opinion surveys, are worried about what AI will mean for them than are excited about it. And I think this is almost entirely a statement about capitalism than it is about AI. It varies a lot between different countries based on the kinds of worker protections that people have come to expect, et cetera. It dramatically moderates their reaction[106] to these exciting/worrying technological developments. And in terms of what can be done about it, I think there's a lot of room to improve the channels of communication between the AI industry and research community with the general public. We've talked a lot about how in many ways people in general, workers in different domains, have a much better understanding of AI's potential and limits in their particular application, like law, medicine, or whatever, than AI developers do.[107] And so AI developers would benefit a lot from understanding that and not making these overhyped claims. But at the same time, I think people deserve to understand why is it that companies are confident enough to make these trillion-dollar bets, understand new emerging capabilities, which, frankly, almost feels like a full-time job to stay on top of. I think companies can do a lot to ease actual public understanding as opposed to just hyping up capabilities. So I think communication could improve in both directions. AUDIENCE: Thank you so much. DARON ACEMOGLU: Well, thank you very much, Arvind.[108] That was wonderful. [APPLAUSE] I'm just going to add that it's a testament to how interested people are. They could have stayed here for another half an hour, but thank you. [APPLAUSE]}
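On the closing exchange about baselines: the two-variable model mentioned there (the defendant's age and the number of prior arrests) is, in code, just a logistic regression with two features. The sketch below shows the shape of such a model; the data and coefficients are entirely synthetic, and it does not reproduce the 40-page baseline analysis the speaker refers to.

```python
# Sketch of a two-variable recidivism baseline: logistic regression on age
# and number of prior arrests. All data and coefficients below are synthetic;
# this shows the shape of the model, not any real-world result.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 20_000
age = rng.integers(18, 70, size=n)
priors = rng.poisson(2.0, size=n)

# Invented data-generating process: more priors and younger age raise risk.
logit = -1.0 + 0.25 * priors - 0.03 * (age - 40)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

X = np.column_stack([age, priors])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_tr, y_tr)
print("AUC of the two-variable baseline:",
      round(roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]), 3))
```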